Predicting the categories of colon cancer using microarray data and nearest shrunken centroid
Abstract
Background & Aim: Classifying and predicting the clinical category of a sample from its gene expression profile is very helpful. This study aimed to predict colorectal adenoma, adenocarcinoma, and paired normal colon tissues from microarray data using the nearest shrunken centroid method.
Methods & Materials: A colon cancer dataset was used, comprising 18 adenocarcinoma, 4 colorectal adenoma, and 22 paired normal colon samples with 2360 common gene expression measurements. The nearest shrunken centroid method was used to predict the categories of colon cancer. R software was used for data analysis.
Results: The nearest shrunken centroid method reduced the 2360 genes to a set of 11 genes: RIG, BIGH3, GLI3, Homo sapiens guanylin, p78, 54kDa, XBP-1, CO-029, desmin, MLC-2, and HMG-1. The method predicted all three classes: colorectal adenoma and adenocarcinoma with zero error, and the normal class with an error of 4.5%.
Conclusion: The nearest shrunken centroid method reduced several thousand genes to 11 genes that assigned colon tissue samples to one of the three classes (adenocarcinoma, colorectal adenoma, and normal) with 97.7% accuracy.
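The shrinkage step behind this kind of gene selection can be sketched as follows. This is a minimal illustration in plain Python, not the R analysis used in the study: the function names `fit_nsc` and `predict_nsc`, the threshold `delta`, the use of the median as the offset `s0`, the `1/n_k - 1/n` standardization (other presentations use a slightly different form), and the toy data are all assumptions made for the example.

```python
import math
from statistics import median


def fit_nsc(X, y, delta):
    """Fit a simplified nearest shrunken centroid model.

    X: list of samples, each a list of gene expression values.
    y: list of class labels (at least two classes, each with n_k < n).
    delta: shrinkage threshold (hypothetical name for the tuning parameter).
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    members = {k: [x for x, lab in zip(X, y) if lab == k] for k in classes}
    overall = [sum(x[j] for x in X) / n for j in range(p)]
    centroids = {k: [sum(x[j] for x in rows) / len(rows) for j in range(p)]
                 for k, rows in members.items()}
    # Pooled within-class standard deviation of each gene.
    s = [math.sqrt(sum((x[j] - centroids[k][j]) ** 2
                       for k, rows in members.items() for x in rows)
                   / (n - len(classes)))
         for j in range(p)]
    s0 = median(s)  # assumed offset that guards against tiny variances

    shrunken, active = {}, set()
    for k, rows in members.items():
        m_k = math.sqrt(1 / len(rows) - 1 / n)  # one common standardization
        shrunken[k] = []
        for j in range(p):
            scale = m_k * (s[j] + s0)
            d = (centroids[k][j] - overall[j]) / scale
            # Soft thresholding: move d toward zero by delta, stopping at zero.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            if d_shrunk != 0:
                active.add(j)  # gene j still separates at least one class
            shrunken[k].append(overall[j] + scale * d_shrunk)

    return {
        "centroids": shrunken,
        "scale": [sj + s0 for sj in s],
        "priors": {k: len(rows) / n for k, rows in members.items()},
        "active": sorted(active),
    }


def predict_nsc(model, x):
    """Assign x to the class with the nearest shrunken centroid."""
    def score(k):
        dist = sum(((xj - cj) / sj) ** 2
                   for xj, cj, sj in zip(x, model["centroids"][k], model["scale"]))
        return dist - 2 * math.log(model["priors"][k])
    return min(model["centroids"], key=score)


# Made-up toy data: 3 "genes", 2 classes; only the first gene differs by class.
X_toy = [[0.0, 1.0, 5.0], [0.2, 1.2, 5.1], [0.1, 0.9, 4.9],
         [3.0, 1.1, 5.0], [3.2, 0.9, 5.2], [3.1, 1.0, 4.8]]
y_toy = ["a", "a", "a", "b", "b", "b"]

model = fit_nsc(X_toy, y_toy, delta=1.0)
print("genes kept:", model["active"])
print("predicted class:", predict_nsc(model, [0.3, 1.0, 5.0]))
```

Genes whose standardized class differences fall below `delta` in every class drop out of the distance calculation, which is how a threshold can reduce thousands of genes to a short list. In practice the threshold is usually chosen by cross-validation rather than fixed in advance as it is here.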
Similar resources
Nearest Shrunken Centroid as Feature Selection of Microarray Data
The nearest shrunken centroid classifier uses shrunken centroids as prototypes for each class and test samples are classified to belong to the class whose shrunken centroid is nearest to it. In our study, the nearest shrunken centroid classifier was used simply to select important genes prior to classification. Random Forest, a decision tree based classification algorithm, is chosen as a classi...
Boosting nearest shrunken centroid classifier for microarray data
Nearest shrunken centroid classifier (NSC) is a class of linear classifiers with built-in feature selections, and has proven useful for analyzing microarray data. The simple linear structure of the classification boundary makes NSC easy to interpret and implement, but sometimes this simple structure might fail to generalize well for some data. In this paper we propose boosting NSC to improve it...
Improved centroids estimation for the nearest shrunken centroid classifier
Motivation: The nearest shrunken centroid (NSC) method has been successfully applied in many DNA-microarray classification problems. The NSC uses 'shrunken' centroids as prototypes for each class and identifies subsets of genes that best characterize each class. Classification is then made to the nearest (shrunken) centroid. The NSC is very easy to implement and very easy to interpret, however, ...
Improved nearest centroid classifier with shrunken distance measure for null LDA method on cancer classification problem
Null linear discriminant analysis (LDA) is a well-known dimensionality reduction technique for the small sample size problem. When the null LDA technique projects the samples to a lower dimensional space, the covariance matrices of individual classes become zero, i.e. all the projected vectors of a given class merge into a single vector. In this case, only the nearest centroid classifier (NCC) ...
Predicting the survival time for bladder cancer using an additive hazards model in microarray data
Background: One substantial part of microarray studies is to predict patients’ survival based on their gene expression profile. Variable selection techniques are powerful tools to handle high dimensionality in analysis of microarray data. However, these techniques have not been investigated in competing risks setting. This study aimed to investigate the performance of four sparse variable selec...
Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data
Introduction: Early diagnosis of breast cancer and the identification of effective genes are important issues in the treatment and survival of the patients. Gene expression data obtained using DNA microarray in combination with machine learning algorithms can provide new and intelligent methods for diagnosis of breast cancer. Methods: Data on the expression of 9216 genes from 84 patients across...
Journal title:
Journal of Biostatistics and Epidemiology, Volume 1, Issue 1-2, pages 16-21